
29 research outputs found

Contributions to bias adjusted stepwise latent class modeling

Author: Bakk Zsuzsa
Publication venue: Ridderprint
Publication date: 01/01/2015

Two-step estimation of latent trait models

Authors: Bakk Zsuzsa; Kuha Jouni
Publication date: 28/03/2023

We consider two-step estimation of latent variable models, in which just the measurement model is estimated in the first step and the measurement parameters are then fixed at their estimated values in the second step where the structural model is estimated. We show how this approach can be implemented for latent trait models (item response theory models) where the latent variables are continuous and their measurement indicators are categorical variables. The properties of two-step estimators are examined using simulation studies and applied examples. They perform well, and have attractive practical and conceptual properties compared to the alternative one-step and three-step approaches. These results are in line with previous findings for other families of latent variable models. This provides strong evidence that two-step estimation is a flexible and useful general method of estimation for different types of latent variable models.

Comment: 39 pages, 2 figures, 17 tables

arXiv.org e-Print Archive

Relating latent class membership to external variables: an overview

Authors: Bakk Zsuzsa; Kuha Jouni
Publication venue: Wiley
Publication date: 16/11/2020

In this article we provide an overview of existing approaches for relating latent class membership to external variables of interest. We extend the work of Nylund-Gibson et al. (Structural Equation Modeling: A Multidisciplinary Journal, 2019, 26, 967), who summarize models with distal outcomes, by also providing an overview of the most recommended modeling options for models with covariates and for larger models with multiple latent variables. We exemplify the modeling approaches using data from the General Social Survey for a model with a distal outcome where underlying model assumptions are violated, and for a model with multiple latent variables. We discuss software availability and provide example syntax for the real data examples in Latent GOLD.

Crossref

LSE Research Online

Leiden University Scholarly Publications

Unraveling the Skillsets of Data Scientists: Text Mining Analysis of Dutch University Master Programs in Data Science and Artificial Intelligence

Authors: Bakk Zsuzsa; Belfi Barbara; Mol Mathijs J.
Publication date: 23/10/2023

The growing demand for data scientists in the global labor market and in the Netherlands has led to a rise in data science and artificial intelligence (AI) master programs offered by universities. However, there is still a lack of clarity regarding the specific skillsets of data scientists. This study aims to address this issue by employing Correlated Topic Modeling (CTM) to analyse the content of 41 master programs offered by seven Dutch universities. We assess the differences and similarities in the core skills taught by these programs, determine the subject-specific and general nature of the skills, and provide a comparison between the different types of universities offering these programs. Our findings reveal that research, data processing, statistics and ethics are the predominant skills taught in Dutch data science and AI master programs, with general universities emphasizing research skills and technical universities focusing more on IT and electronic skills. This study contributes to a better understanding of the diverse skillsets of data scientists, which is essential for employers, universities, and prospective students.

arXiv.org e-Print Archive

Two-step estimation of models between latent classes and external variables

Authors: Bakk Zsuzsa; Kuha Jouni
Publication venue: Springer Science and Business Media LLC
Publication date: 17/11/2017

We consider models which combine latent class measurement models for categorical latent variables with structural regression models for the relationships between the latent classes and observed explanatory and response variables. We propose a two-step method of estimating such models. In its first step the measurement model is estimated alone, and in the second step the parameters of this measurement model are held fixed when the structural model is estimated. Simulation studies and applied examples suggest that the two-step method is an attractive alternative to existing one-step and three-step methods. We derive estimated standard errors for the two-step estimates of the structural model which account for the uncertainty from both steps of the estimation, and show how the method can be implemented in existing software for latent variable modelling.

LSE Research Online

Leiden University Scholarly Publications

Multilevel latent class analysis with covariates: Analysis of cross-national citizenship norms with a two-stage approach

Authors: Bakk Zsuzsa; Di Mari Roberto; Kuha Jouni; Oser Jennifer
Publication date: 20/07/2023

This paper focuses on the substantive application of multilevel LCA to the evolution of citizenship norms in a diverse array of democratic countries. To do so, we present a two-stage approach to fit multilevel latent class models: in the first stage (measurement model construction), unconditional class enumeration is done separately on both low- and high-level latent variables, estimating only a part of the model at a time -- hence keeping the remaining part fixed -- and then updating the full measurement model; in the second stage (structural model construction), individual and/or group covariates are included in the model. By separating the two parts -- first stage and second stage of model building -- the measurement model is stabilized and is allowed to be determined only by its indicators. Moreover, this two-step approach makes the inclusion/exclusion of a covariate a relatively simple task to handle. Our proposal amends common practice in applied social science research, where simple (low-level) LCA is done to obtain a classification of low-level units, and this is then related to (low- and high-level) covariates by simply including group fixed effects. Our analysis identifies latent classes that score either consistently high or consistently low on all measured items, along with two theoretically important classes that place distinctive emphasis on items related to engaged citizenship and duty-based norms.

arXiv.org e-Print Archive

A two-step estimator for multilevel latent class analysis with covariates

Authors: Bakk Zsuzsa; Di Mari Roberto; Kuha Jouni; Oser Jennifer
Publication date: 05/07/2023

We propose a two-step estimator for multilevel latent class analysis (LCA) with covariates. The measurement model for observed items is estimated in its first step, and in the second step covariates are added in the model, keeping the measurement model parameters fixed. We discuss model identification, and derive an Expectation Maximization algorithm for efficient implementation of the estimator. By means of an extensive simulation study we show that (i) this approach performs similarly to existing stepwise estimators for multilevel LCA but with much reduced computing time, and (ii) it yields approximately unbiased parameter estimates with a negligible loss of efficiency compared to the one-step estimator. The proposal is illustrated with a cross-national analysis of predictors of citizenship norms.

Comment: Manuscript version accepted for publication in Psychometrika

arXiv.org e-Print Archive

A two-step estimator for multilevel latent class analysis with covariates

Authors: Bakk Zsuzsa; Di Mari Roberto; Kuha Jouni; Oser Jennifer
Publication date: 06/08/2023

We propose a two-step estimator for multilevel latent class analysis (LCA) with covariates. The measurement model for observed items is estimated in its first step, and in the second step covariates are added in the model, keeping the measurement model parameters fixed. We discuss model identification, and derive an Expectation Maximization algorithm for efficient implementation of the estimator. By means of an extensive simulation study we show that (1) this approach performs similarly to existing stepwise estimators for multilevel LCA but with much reduced computing time, and (2) it yields approximately unbiased parameter estimates with a negligible loss of efficiency compared to the one-step estimator. The proposal is illustrated with a cross-national analysis of predictors of citizenship norms.

LSE Research Online

Modeling predictors of latent classes in regression mixture models

Authors: Bakk Zsuzsa; Jaki Thomas Friedrich; Kim Minjung; Van Horn M. Lee; Vermunt Jeroen
Publication venue: Informa UK Limited
Publication date: 01/06/2016

The purpose of this study is to provide guidance on a process for including latent class predictors in regression mixture models. We first examine the performance of current practice for using the 1-step and 3-step approaches where the direct covariate effect on the outcome is omitted. None of the approaches show adequate estimates of model parameters. Given that Step 1 of the 3-step approach shows adequate results in class enumeration, we suggest using an alternative approach: (a) decide the number of latent classes without predictors of latent classes, and (b) bring the latent class predictors into the model with the inclusion of hypothesized direct covariate effects. Our simulations show that this approach leads to good estimates for all model parameters. The proposed approach is demonstrated by using empirical data to examine the differential effects of family resources on students’ academic achievement outcome. Implications of the study are discussed.

Lancaster E-Prints